Analysing ParCor and its Translations by State-of-the-art SMT Systems

Authors

  • Liane Guillou
  • Bonnie L. Webber
Abstract

Previous work on pronouns in SMT has focussed on third-person pronouns, treating them all as anaphoric. Little attention has been paid to other uses or other types of pronouns. Believing that further progress requires careful analysis of pronouns as a whole, we have analysed a parallel corpus of annotated English-German texts to highlight some of the problems that hinder progress. We combine this with an assessment of the ability of two state-of-the-art systems to translate different pronoun types.

Similar resources

Lexical Syntax for Statistical Machine Translation

Statistical Machine Translation (SMT) is by far the most dominant paradigm of Machine Translation. This can be justified by many reasons, such as accuracy, scalability, computational efficiency and fast adaptation to new languages and domains. However, current approaches to phrase-based SMT lack the capability to produce more grammatical translations and to handle long-range reordering whil...

Neural Machine Translation Advised by Statistical Machine Translation

Neural Machine Translation (NMT) is a new approach to machine translation that has made great progress in recent years. However, recent studies show that NMT generally produces fluent but inadequate translations (Tu et al. 2016; He et al. 2016). This is in contrast to conventional Statistical Machine Translation (SMT), which usually yields adequate but non-fluent translations. It is natural, th...

Tackling Close Cousins: Experiences In Developing Statistical Machine Translation Systems For Marathi And Hindi

In this paper we present our experiences in building Statistical Machine Translation (SMT) systems for the Indian language pair Marathi and Hindi, which are close cousins. We briefly point out the similarities and differences between the two languages, stressing the translation of Krudantas (Verb Groups), which rule-based systems are not able to do well. Marathi, bein...

An Efficient Phrase-to-Phrase Alignment Model for Arbitrarily Long Phrase and Large Corpora

Most statistical machine translation (SMT) systems use phrase-to-phrase translations to capture local context information, leading to better lexical choices and more reliable word reordering. Long phrases capture more context than short phrases and result in better translation quality. On the other hand, the increasing amount of bilingual data poses serious problems for storing all possible ...

WSD for n-best reranking and local language modeling in SMT

We integrate semantic information at two stages of the translation process of a state-of-the-art SMT system. A Word Sense Disambiguation (WSD) classifier produces a probability distribution over the translation candidates of source words which is exploited in two ways. First, the probabilities serve to rerank a list of n-best translations produced by the system. Second, the WSD predictions are u...

Publication date: 2015